Though a single predictor may contain useful information about the response variable, it rarely tells the whole story: predictor X1 is unlikely to be the only factor that helps us predict the response. In hopes of improving our posterior predictions, we can incorporate multiple predictors into the same model. Using several predictors together, instead of one alone, can produce more accurate posterior predictions of the response variable, which we can evaluate through posterior predictive checks.
We can think of the Ford category as the baseline or reference level of the make of the car: when X1 = X2 = X3 = 0, the model automatically describes a Ford, so we do not need an indicator term for the Ford category.
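As a concrete sketch of this coding scheme (with hypothetical makes standing in for the unnamed categories), R's model.matrix() shows how a factor with Ford as its first level produces indicator columns only for the non-Ford makes:

```r
# Hypothetical car makes; Ford listed first so it becomes the reference level
make <- factor(c("Ford", "Subaru", "Toyota", "Hyundai"),
               levels = c("Ford", "Subaru", "Toyota", "Hyundai"))

# model.matrix() builds the design matrix used in regression:
# one indicator column per non-reference level, none for Ford
X <- model.matrix(~ make)
X
```

The Ford row is all zeros apart from the intercept column, which is exactly the X1 = X2 = X3 = 0 scenario above.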
The Subaru coefficient β2 represents the typical difference in miles per gallon in a city between a Subaru (X2 = 1) and a non-Subaru (X2 = 0). Thus β2 technically measures the typical change in a car's miles per gallon in a city for a 1-unit increase in X2; it's just that this increase from 0 to 1 corresponds to a change in the make of the car.
The intercept coefficient β0 represents the typical Ford car's miles per gallon in a city (X1 = X2 = X3 = 0).
Age coefficient β1 is the common slope of the Mr. Stripey and Roma lines. Thus, β1 represents the typical change in a tomato’s weight in grams with every 1-day increase in the tomato’s age for both Mr. Stripey and Roma tomatoes.
The Roma coefficient β2 represents the typical difference in weight in grams between a Roma tomato (X2 = 1) and a Mr. Stripey tomato (X2 = 0). Thus β2 technically measures the typical change in a tomato's weight in grams for a 1-unit increase in X2; it's just that this increase from 0 to 1 corresponds to a change in the tomato's type.
If X1 and X2 interact, a tomato's weight in grams increases with age for both Mr. Stripey and Roma tomatoes, yet the rate of this increase differs, being more rapid for one type of tomato.
The interaction coefficient β3 captures the difference in slopes, and thus how the change in a tomato's weight in grams with age differs between the two types of tomatoes, Mr. Stripey and Roma.
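To see why β3 is the difference in slopes, plug the two values of the type indicator X2 into the interaction model's mean (here X1 is the tomato's age; this just rearranges the model above):

μi = β0 + β1X1 + β2X2 + β3X1X2

Mr. Stripey (X2 = 0): μi = β0 + β1X1, with slope β1

Roma (X2 = 1): μi = (β0 + β2) + (β1 + β3)X1, with slope β1 + β3

The difference between the two slopes is (β1 + β3) − β1 = β3.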
Consider modeling generalized trust (Y) by one’s marriage status (X2, being unmarried as the reference level) and homophily preference (X3). Thus the model can be denoted as Yi|β0,β1,β2,β3,σ ∼ N(μi,σ) with μi = β0 + β1X2 + β2X3 + β3X2X3.
knitr::include_graphics("image/interaction_term.png")
Consider modeling generalized trust (Y) by one’s gender (X4, being women as the reference level) and educational length (X5). Thus the model can be denoted as Yi|β0,β4,β5,σ ∼ N(μi,σ) with μi = β0 + β4X4 + β5X5.
knitr::include_graphics("image/no_interaction_term.png")
Overly simple models with few or no predictors tend to have high bias: the estimate of the model mean μ can be far from the relationship actually observed in the data. To prevent our model from being overly simple and rigid, adding predictors can be beneficial: the extra flexibility lets the model better follow the trends in the observed data and decreases the bias. By including more than one predictor, multivariable regression is likely to improve our predictions and understanding of the response variable.
Overly complicated models with lots of predictors tend to have low bias but high variability (low stability). Since such a model is structured to pick up tiny, sample-specific details in the relationship, it has low bias but high variance from sample to sample. Using such a highly variable model has two serious consequences. First, the results are unstable: different data might produce very different model estimates, and hence very different conclusions about the relationship. Second, the results are overfit to our sample data and generalize poorly, so the model won't do a good job of predicting new, unobserved data. Thus it is beneficial to identify and remove redundant or irrelevant predictors that have little impact on the response variable.
I might add the categorical variable gender (with categories female and male, male being the reference level) to this model to improve the predictions of shoe size, since boys biologically tend to grow larger feet than girls of the same age.
I might remove the indicator of whether the child knows how to swim (X2) from this model, since that variable has little apparent impact on the response: whether a child can swim has little to do with the child's shoe size.
A good model would be fair in terms of its data collection, data collector(s), data analysis purpose(s), and the resulting impacts on individuals and society. There shouldn’t be biases baked into the data analysis. All models are wrong. Mainly, statistical models are idealistic representations of more complex realities. Even so, good statistical models can still be useful and inform our understanding of the world’s complexities. A good Bayesian model not only produces posterior predicted datasets with features similar to the original observed data, but also can be used to accurately predict the outcomes of new data. That is, a good model will generalize beyond the data we used to build it. A good model also strikes a balance between bias and variability and thus enjoys relatively low bias and low variability. It should recognize the pattern and neglect the noise (not reproduce the random errors) in the data.
A bad model is not fair: it may be ethically dubious or biased in its data collection or analysis. A bad model is wrong in that it fails to produce posterior simulated data that share features with the observed data; the extent to which it matches reality is so low that it is unacceptable in its particular context, and the assumptions behind it are not reasonable. A bad model also has poor posterior predictive accuracy for new data, or even for the sample data points used to build it, and so cannot generalize beyond our original sample. A bad model can be underfit: with few or no predictors, it does not fit the training data well, yielding low variance from sample to sample but high bias, since it is not complex enough to follow the trends in the data. Or it can be overfit: with lots of predictors, it picks up too much tiny, sample-specific noise in the training data, yielding low bias but such high variance from sample to sample that its results and conclusions are unstable (estimates differ significantly depending on which data are used) and its predictions for new data are poor, since it reproduces the random errors of the original data rather than recognizing the underlying pattern.
In this chapter, we learn that we can evaluate a model's predictive accuracy using visualizations: figures generated by posterior predictive check functions (e.g., ppc_intervals, ppc_violin_grouped, ppc_intervals_grouped) illustrate how many of the observed data points fall within the bounds of their posterior predictive models, based on different predictors alone or combined, and help us determine whether those models accurately capture the observed behavior of the data. But we can't actually visualize the posterior predictive quality of every model; even if we could, it can be difficult to ascertain the comparative quality of our models from visuals alone; and visual evidence only illustrates how well our models predict the same data points we used to build them, and thus might give overly optimistic assessments of predictive accuracy. We can also numerically assess the posterior predictive quality of a model using cross-validation: running k-fold cross-validation through the prediction_summary_cv function gives cross-validated posterior predictive summaries (including the median absolute error, the scaled median absolute error, and the within_50 and within_95 statistics) that provide insight into which model to adopt for different purposes. The expected log-predictive density (ELPD) is another numerical summary of predictive accuracy. We can calculate the ELPD for each model with the loo function: the larger the expected logged posterior predictive pdf across a new set of data points, the more accurate the posterior predictions of new data.
We can also compare the posterior predictive accuracy of different models using the loo_compare function. Though it does not provide an interpretable metric for any single model, it lists the models from best to worst, i.e., from the highest ELPD to the lowest, along with each model's estimated ELPD difference from the best model and the corresponding standard error. We can thus determine whether the posterior predictive accuracy of one model is significantly greater than that of another. Put together, we can check whether the results of these evaluation techniques are consistent with one another to reach a more solid evaluation of the models.
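As a minimal sketch of this workflow (assuming two already-fitted stan_glm models, here called model_1 and model_2, and their data frame my_data; these names are placeholders):

```r
# ELPD via leave-one-out cross-validation, one loo object per model
loo_1 <- loo(model_1)
loo_2 <- loo(model_2)
loo_1$estimates   # elpd_loo row is the ELPD estimate; higher is better

# Rank models from best to worst ELPD; elpd_diff is measured against
# the top model, alongside its standard error se_diff
loo_compare(loo_1, loo_2)

# Cross-validated posterior predictive summaries from bayesrules:
# mae, mae_scaled, within_50, within_95
prediction_summary_cv(model = model_1, data = my_data, k = 10)
```

A rough rule of thumb is that a model is meaningfully better only when its |elpd_diff| is large relative to se_diff.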
The bias–variance tradeoff indicates that increasing a model's ability to minimize the difference between its simulated data and the actual values it tries to predict can bring an undesirable increase in the variance of the model's predictions across datasets, making the model poor at generalizing to new data it hasn't been trained on; conversely, reducing the variance of the predictions across samples can bring an undesirable increase in bias, i.e., a failure to follow the trends in the observed data.
It is important to keep this tradeoff in mind: though adding relevant predictors can make a model less biased, we shouldn't greedily incorporate unnecessary predictors that do not improve predictions considerably. Our goal is not to exactly represent the observed data but to build a model with appropriate predictors that strikes a good balance, enjoying relatively low bias and low variability (high stability), so it can recognize the pattern in the observed data, generalize to data it hasn't seen before, and thus make accurate predictions. Appropriate simplification is what improves a model's stability from sample to sample; otherwise the model learns too much from the data, tracing the observed data well but predicting new data poorly.
# Load some packages
library(bayesrules)
library(rstanarm)
## Loading required package: Rcpp
## This is rstanarm version 2.21.1
## - See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!
## - Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.
## - For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores())
library(bayesplot)
## This is bayesplot version 1.8.1
## - Online documentation and vignettes at mc-stan.org/bayesplot
## - bayesplot theme set to bayesplot::theme_default()
## * Does _not_ affect other ggplot2 plots
## * See ?bayesplot_theme_set for details on theme setting
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.5 ✓ dplyr 1.0.7
## ✓ tidyr 1.1.4 ✓ stringr 1.4.0
## ✓ readr 2.0.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(broom.mixed)
library(tidybayes)
# Load the data
data(penguins_bayes)
# Alternative penguin data
penguin_data <- penguins_bayes %>%
  filter(species %in% c("Adelie", "Gentoo"))
penguins_bayes %>%
  ggplot(aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(size = 0.2) +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (stat_smooth).
## Warning: Removed 2 rows containing missing values (geom_point).
ggplot(penguins_bayes, aes(x = body_mass_g, fill = species)) +
  geom_density(alpha = 0.5)
## Warning: Removed 2 rows containing non-finite values (stat_density).
ggplot(penguins_bayes, aes(x = flipper_length_mm, fill = species)) +
  geom_density(alpha = 0.5)
## Warning: Removed 2 rows containing non-finite values (stat_density).
The plot above indicates that a penguin’s body mass increases with its flipper length. A penguin’s weight tends to be around 4207 g at the mean flipper length of roughly 201 mm, but ranges from about 2500 to 6500 g. The scatter around this pattern reflects some uncertainty about the association, but we can see that a penguin’s weight is moderately associated with its flipper length and exhibits some variability at any given flipper length.
Density plots of the body mass of the three penguin species indicate that the typical Chinstrap penguins’ body mass tends to be a bit greater than Adelie penguins’ and typical Gentoo penguins’ body mass tends to be much greater than both Adelie and Chinstrap penguins’.
Density plots of the flipper lengths of the three penguin species indicate that typical Chinstrap penguins’ flipper lengths tend to be greater than Adelie penguins’, and typical Gentoo penguins’ flipper lengths tend to be much greater than both Adelie and Chinstrap penguins’.
mass_1 <- na.omit(penguins_bayes) %>%
  summarise(mean_flipper_length = mean(flipper_length_mm),
            mean_mass = mean(body_mass_g),
            std_dev = sd(body_mass_g))
mass_1
## # A tibble: 1 × 3
##   mean_flipper_length mean_mass std_dev
##                 <dbl>     <dbl>   <dbl>
## 1                201.     4207.    805.
mass_model_1 <- stan_glm(body_mass_g ~ flipper_length_mm + species,
                         data = penguins_bayes,
                         family = gaussian,
                         prior_intercept = normal(4000, 250),
                         prior = normal(0, 2.5, autoscale = TRUE),
                         prior_aux = exponential(1, autoscale = TRUE),
                         prior_PD = FALSE,
                         chains = 4, iter = 5000*2, seed = 616)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000124 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.24 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.183815 seconds (Warm-up)
## Chain 1: 0.262502 seconds (Sampling)
## Chain 1: 0.446317 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 5e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.188919 seconds (Warm-up)
## Chain 2: 0.266662 seconds (Sampling)
## Chain 2: 0.455581 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 8e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.193665 seconds (Warm-up)
## Chain 3: 0.279951 seconds (Sampling)
## Chain 3: 0.473616 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 4e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.182287 seconds (Warm-up)
## Chain 4: 0.328546 seconds (Sampling)
## Chain 4: 0.510833 seconds (Total)
## Chain 4:
# MCMC diagnostics
mcmc_trace(mass_model_1, size = 0.1)
mcmc_dens_overlay(mass_model_1)
mcmc_acf(mass_model_1)
neff_ratio(mass_model_1)
## (Intercept) flipper_length_mm speciesChinstrap speciesGentoo
## 0.55665 0.54605 0.69595 0.51785
## sigma
## 0.86930
rhat(mass_model_1)
## (Intercept) flipper_length_mm speciesChinstrap speciesGentoo
## 0.9999717 0.9999720 0.9999323 0.9999186
## sigma
## 0.9999274
There are no strong trends or flat lines in the trace plots. The four chains are consistent in the density plots: they exhibit similar features and produce nearly indistinguishable posterior approximations. The autocorrelations quickly drop off and are effectively 0 by lag 5, which further confirms that our Markov chains are mixing quickly, i.e., quickly moving around the range of posterior plausible values, and thus at least mimicking an independent sample. The effective sample size ratios indicate that the accuracy of using our 20,000-draw Markov chains to approximate the posterior is roughly as great as if we had used 55.7% as many independent values of the intercept coefficient, 54.6% as many of the flipper length coefficient, 69.6% as many of the Chinstrap coefficient, 51.8% as many of the Gentoo coefficient, and 86.9% as many of the standard deviation parameter. These ratios are satisfyingly high. The R-hat metric calculates the ratio between the variability in a simulated parameter across all four chains combined and the typical variability within any individual chain; the R-hat values of approximately 1 (< 1.05) here reflect stability across the parallel chains. All this indicates that the Markov chains are mixing quickly and the simulation has stabilized.
# Posterior summary statistics
tidy(mass_model_1, effects = c("fixed", "aux"),
conf.int = TRUE, conf.level = 0.80)
## # A tibble: 6 × 5
## term estimate std.error conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -4043. 584. -4791. -3289.
## 2 flipper_length_mm 40.8 3.07 36.8 44.7
## 3 speciesChinstrap -206. 58.8 -282. -131.
## 4 speciesGentoo 266. 95.9 143. 388.
## 5 sigma 376. 14.5 359. 395.
## 6 mean_PPD 4201. 28.5 4164. 4237.
With the flipper length coefficient’s posterior median value of 40.75, the Chinstrap coefficient’s of -206.43, and the Gentoo coefficient’s of 265.77, the posterior median relationship is -4042.73 + 40.75X1 - 206.43X2 + 265.77X3, where X2 and X3 indicate whether or not the penguin is a Chinstrap or a Gentoo, respectively. That is, for every one-mm increase in flipper length, we expect a penguin’s weight to increase by roughly 40.75 g; for a one-unit increase in X2 (the increase from 0 to 1 corresponding to a change in species category, the penguin being a Chinstrap or not), we expect the weight to decrease by roughly 206.43 g; and for a one-unit increase in X3 (the penguin being a Gentoo or not), we expect the weight to increase by roughly 265.77 g.
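Plugging the indicator values for each species into this posterior median relationship gives one parallel line per species (same slope, shifted intercepts):

Adelie (X2 = X3 = 0): -4042.73 + 40.75X1

Chinstrap (X2 = 1): (-4042.73 - 206.43) + 40.75X1 = -4249.16 + 40.75X1

Gentoo (X3 = 1): (-4042.73 + 265.77) + 40.75X1 = -3776.96 + 40.75X1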
The posterior median value of the σ parameter, 376.17 g, indicates that on average we can expect the observed weight of a penguin of a given species and flipper length to deviate about 376 g from the average weight of penguins of a similar species and similar flipper length.
# Simulate a set of predictions
set.seed(6)
mass_prediction_1 <- posterior_predict(mass_model_1, newdata = data.frame(flipper_length_mm = 197, species = "Adelie"))
View(mass_prediction_1)
# Plot the posterior predictive models
mcmc_areas(mass_prediction_1) +
ggplot2::scale_y_discrete(labels = "Adelie") +
xlab("body_mass")
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
The plot above indicates that the typical weight of an Adelie penguin with a flipper length of 197 mm is around 4000 g, and is likely between 2000 and 6000 g.
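The "likely between 2000 and 6000 g" reading can be made precise by summarizing the prediction draws with quantiles. A minimal sketch, using simulated Normal draws as a stand-in for the actual posterior_predict() output in mass_prediction_1:

```r
set.seed(6)
# Stand-in for a posterior_predict() result: 20,000 predicted weights (g)
draws <- rnorm(20000, mean = 4000, sd = 400)

# Posterior median prediction and an 80% posterior prediction interval
median(draws)
quantile(draws, probs = c(0.10, 0.90))
```

Applying the same median()/quantile() calls to the real draws would give the numeric interval behind the plot above.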
mass_model_2 <- stan_glm(body_mass_g ~ flipper_length_mm + species +
                           flipper_length_mm:species,
                         data = penguins_bayes,
                         family = gaussian,
                         prior_intercept = normal(4000, 250),
                         prior = normal(0, 2.5, autoscale = TRUE),
                         prior_aux = exponential(1, autoscale = TRUE),
                         prior_PD = FALSE,
                         chains = 4, iter = 5000*2, seed = 616)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.4e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.14 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 2.42089 seconds (Warm-up)
## Chain 1: 2.78708 seconds (Sampling)
## Chain 1: 5.20797 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 3e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 2.48694 seconds (Warm-up)
## Chain 2: 2.50473 seconds (Sampling)
## Chain 2: 4.99167 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 4e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 2.42576 seconds (Warm-up)
## Chain 3: 2.74225 seconds (Sampling)
## Chain 3: 5.16801 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 5e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 2.32724 seconds (Warm-up)
## Chain 4: 2.40583 seconds (Sampling)
## Chain 4: 4.73308 seconds (Total)
## Chain 4:
penguins_complete <- penguins_bayes %>%
  select(flipper_length_mm, body_mass_g, species,
         bill_length_mm, bill_depth_mm) %>%
  na.omit()
# 50 simulated model lines
penguins_complete %>%
  add_epred_draws(mass_model_2, ndraws = 50) %>%
  ggplot(aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_line(aes(y = .epred, group = paste(species, .draw)), alpha = 0.1)
The 50 posterior simulated models capture our posterior uncertainty in these relationships and provide some posterior evidence that flipper length and species do interact: though body mass increases with flipper length at roughly the same rate for Adelie and Chinstrap penguins, body mass increases with flipper length at a more rapid rate for the Gentoo penguins than for the Adelie and Chinstrap penguins.
# Posterior summary statistics
tidy(mass_model_2, effects = c("fixed", "aux"))
## # A tibble: 8 × 3
## term estimate std.error
## <chr> <dbl> <dbl>
## 1 (Intercept) -2874. 821.
## 2 flipper_length_mm 34.6 4.32
## 3 speciesChinstrap -174. 1382.
## 4 speciesGentoo -3390. 1289.
## 5 flipper_length_mm:speciesChinstrap -0.0109 7.11
## 6 flipper_length_mm:speciesGentoo 17.6 6.28
## 7 sigma 371. 14.6
## 8 mean_PPD 4200. 28.1
Based on the summary, we have evidence that the interaction terms are necessary for this model, since flipper length and species do interact. With Adelie penguins as the reference level, the estimated interaction coefficient flipper_length_mm:speciesChinstrap is only slightly below 0 (using the posterior_interval function, we further find that its 80% posterior credible interval straddles 0), suggesting that the association between body mass and flipper length for Chinstrap penguins is similar to that for Adelie penguins. In contrast, the estimated interaction coefficient flipper_length_mm:speciesGentoo is well above 0 (its 80% posterior credible interval lies well above 0), suggesting that the association between body mass and flipper length is significantly more positive for Gentoo penguins than for Adelie penguins.
posterior_interval(mass_model_2, prob = 0.80,
pars = c("flipper_length_mm:speciesChinstrap", "flipper_length_mm:speciesGentoo"))
## 10% 90%
## flipper_length_mm:speciesChinstrap -9.191961 8.994679
## flipper_length_mm:speciesGentoo 9.715937 25.508503
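Combining these posterior medians with the shared flipper length coefficient gives the species-specific slopes implied by the model (in g of body mass per mm of flipper length, rounded):

Adelie: 34.6

Chinstrap: 34.6 + (-0.01) ≈ 34.6

Gentoo: 34.6 + 17.6 = 52.2

That is, a Gentoo penguin’s body mass increases with flipper length roughly 1.5 times as fast as an Adelie’s or Chinstrap’s.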
mass_model_3 <- stan_glm(body_mass_g ~ flipper_length_mm + bill_length_mm +
                           bill_depth_mm,
                         data = penguins_bayes,
                         family = gaussian,
                         prior_intercept = normal(4000, 250),
                         prior = normal(0, 2.5, autoscale = TRUE),
                         prior_aux = exponential(1, autoscale = TRUE),
                         prior_PD = FALSE,
                         chains = 4, iter = 5000*2, seed = 616)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.171998 seconds (Warm-up)
## Chain 1: 0.249178 seconds (Sampling)
## Chain 1: 0.421176 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 4e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.172189 seconds (Warm-up)
## Chain 2: 0.234952 seconds (Sampling)
## Chain 2: 0.407141 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 6e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.153341 seconds (Warm-up)
## Chain 3: 0.252411 seconds (Sampling)
## Chain 3: 0.405752 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 4e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.180956 seconds (Warm-up)
## Chain 4: 0.238526 seconds (Sampling)
## Chain 4: 0.419482 seconds (Total)
## Chain 4:
# Posterior credible intervals
posterior_interval(mass_model_3, prob = 0.95)
## 2.5% 97.5%
## (Intercept) -7528.627461 -5301.46062
## flipper_length_mm 45.368485 55.18079
## bill_length_mm -6.354772 14.59262
## bill_depth_mm -7.250491 47.15120
## sigma 365.765804 425.81021
Based on these 95% credible intervals, and controlling for the other predictors in the model, flipper_length_mm is the only predictor with a significant positive association with body mass: its 95% posterior credible interval, (45.368485, 55.18079), is the only one that lies entirely above 0. No predictor has a significant negative association with body mass, since none of the 95% posterior credible intervals lies entirely below 0. The bill_length_mm and bill_depth_mm coefficients both have 95% credible intervals that straddle 0, so neither predictor has a significant association with body mass. Though both of those intervals lie mostly above 0, suggesting that body mass is positively associated with bill length and bill depth when controlling for the other model predictors, this waffling evidence invites some skepticism.
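As a sanity check on these interval-based conclusions, we can also compute a credible interval and the posterior probability that a coefficient is positive directly from posterior draws (in R, `mean(draws > 0)` on a column of `as.data.frame(mass_model_3)`). A minimal language-neutral sketch, using hypothetical Normal draws loosely mimicking the flipper_length_mm coefficient, not the actual mass_model_3 posterior:

```python
import numpy as np

# Hypothetical posterior draws for a coefficient (NOT the real mass_model_3
# posterior): center and spread chosen to resemble flipper_length_mm above.
rng = np.random.default_rng(42)
draws = rng.normal(loc=50.3, scale=2.5, size=20000)

# 95% credible interval: the 2.5th and 97.5th percentiles of the draws.
lower, upper = np.quantile(draws, [0.025, 0.975])

# Posterior probability that the coefficient is positive.
p_positive = (draws > 0).mean()

print(lower > 0)  # True: this interval lies entirely above 0
```

When the interval lies entirely above 0, `p_positive` is essentially 1, which is exactly the "significant positive association" reading above.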
# Simulating model 1
model_1 <- stan_glm(body_mass_g ~ flipper_length_mm,
                    data = penguins_bayes,
                    family = gaussian,
                    prior_intercept = normal(4000, 250),
                    prior = normal(0, 2.5, autoscale = TRUE),
                    prior_aux = exponential(1, autoscale = TRUE),
                    prior_PD = FALSE,
                    chains = 4, iter = 5000*2, seed = 616)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.098408 seconds (Warm-up)
## Chain 1: 0.186717 seconds (Sampling)
## Chain 1: 0.285125 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 4e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.098901 seconds (Warm-up)
## Chain 2: 0.185134 seconds (Sampling)
## Chain 2: 0.284035 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 8e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.08 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.134908 seconds (Warm-up)
## Chain 3: 0.18612 seconds (Sampling)
## Chain 3: 0.321028 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 4e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.11865 seconds (Warm-up)
## Chain 4: 0.187676 seconds (Sampling)
## Chain 4: 0.306326 seconds (Total)
## Chain 4:
# Simulating model 2
model_2 <- stan_glm(body_mass_g ~ species,
                    data = penguins_bayes,
                    family = gaussian,
                    prior_intercept = normal(4000, 250),
                    prior = normal(0, 2.5, autoscale = TRUE),
                    prior_aux = exponential(1, autoscale = TRUE),
                    prior_PD = FALSE,
                    chains = 4, iter = 5000*2, seed = 616)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.12539 seconds (Warm-up)
## Chain 1: 0.197494 seconds (Sampling)
## Chain 1: 0.322884 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 4e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.115871 seconds (Warm-up)
## Chain 2: 0.196365 seconds (Sampling)
## Chain 2: 0.312236 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 5e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.140631 seconds (Warm-up)
## Chain 3: 0.197461 seconds (Sampling)
## Chain 3: 0.338092 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 5e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.112404 seconds (Warm-up)
## Chain 4: 0.195717 seconds (Sampling)
## Chain 4: 0.308121 seconds (Total)
## Chain 4:
# Simulating model 3
model_3 <- stan_glm(body_mass_g ~ flipper_length_mm + species,
                    data = penguins_bayes,
                    family = gaussian,
                    prior_intercept = normal(4000, 250),
                    prior = normal(0, 2.5, autoscale = TRUE),
                    prior_aux = exponential(1, autoscale = TRUE),
                    prior_PD = FALSE,
                    chains = 4, iter = 5000*2, seed = 616)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.3e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.13 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.184409 seconds (Warm-up)
## Chain 1: 0.262448 seconds (Sampling)
## Chain 1: 0.446857 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 4e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.04 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.189054 seconds (Warm-up)
## Chain 2: 0.266162 seconds (Sampling)
## Chain 2: 0.455216 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 3e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.193242 seconds (Warm-up)
## Chain 3: 0.284639 seconds (Sampling)
## Chain 3: 0.477881 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 5e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.05 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.183789 seconds (Warm-up)
## Chain 4: 0.334502 seconds (Sampling)
## Chain 4: 0.518291 seconds (Total)
## Chain 4:
# Simulating model 4
model_4 <- stan_glm(body_mass_g ~ flipper_length_mm + bill_length_mm + bill_depth_mm,
                    data = penguins_bayes,
                    family = gaussian,
                    prior_intercept = normal(4000, 250),
                    prior = normal(0, 2.5, autoscale = TRUE),
                    prior_aux = exponential(1, autoscale = TRUE),
                    prior_PD = FALSE,
                    chains = 4, iter = 5000*2, seed = 616)
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 1.6e-05 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.16 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 1: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 1: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 1: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 1: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 1: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 1: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 1: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 1: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 1: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 1: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 1: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 0.167218 seconds (Warm-up)
## Chain 1: 0.245785 seconds (Sampling)
## Chain 1: 0.413003 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 6e-06 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 2: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 2: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 2: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 2: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 2: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 2: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 2: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 2: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 2: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 2: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 2: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 0.166517 seconds (Warm-up)
## Chain 2: 0.230941 seconds (Sampling)
## Chain 2: 0.397458 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 3e-06 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.03 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 3: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 3: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 3: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 3: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 3: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 3: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 3: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 3: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 3: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 3: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 3: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 0.14914 seconds (Warm-up)
## Chain 3: 0.244584 seconds (Sampling)
## Chain 3: 0.393724 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'continuous' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 6e-06 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.06 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 10000 [ 0%] (Warmup)
## Chain 4: Iteration: 1000 / 10000 [ 10%] (Warmup)
## Chain 4: Iteration: 2000 / 10000 [ 20%] (Warmup)
## Chain 4: Iteration: 3000 / 10000 [ 30%] (Warmup)
## Chain 4: Iteration: 4000 / 10000 [ 40%] (Warmup)
## Chain 4: Iteration: 5000 / 10000 [ 50%] (Warmup)
## Chain 4: Iteration: 5001 / 10000 [ 50%] (Sampling)
## Chain 4: Iteration: 6000 / 10000 [ 60%] (Sampling)
## Chain 4: Iteration: 7000 / 10000 [ 70%] (Sampling)
## Chain 4: Iteration: 8000 / 10000 [ 80%] (Sampling)
## Chain 4: Iteration: 9000 / 10000 [ 90%] (Sampling)
## Chain 4: Iteration: 10000 / 10000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 0.176295 seconds (Warm-up)
## Chain 4: 0.23504 seconds (Sampling)
## Chain 4: 0.411335 seconds (Total)
## Chain 4:
# Examine 50 of the 20000 simulated datasets from each of the 4 models
pp_check(model_1, nreps = 50)
pp_check(model_2, nreps = 50)
pp_check(model_3, nreps = 50)
pp_check(model_4, nreps = 50)
Though the 50 sets of body mass data simulated from each of the 4 models (light blue) tend to capture the general center and spread of the body mass we actually observed (dark blue), none of them captures all three humps in the penguin weights. Models 1 and 4 vaguely show bimodality while models 2 and 3 clearly exhibit it, yet none exactly reflects the fact that there is one batch of penguin weights around 3600 grams, another around 4500 grams, and another around 5500 grams. Still, if we are using the models to predict the center, spread, and tails of the data, model 3 does an especially good job of capturing what we need: its density plots are the most consistent among the 4 models. The centers captured by models 1 and 4 are both shifted slightly left of the original data, and model 1 approximates the center better than model 4, since some of the datasets simulated from model 4 deviate more radically from the original data. Among the 4 models, model 2 exhibits the least stable and consistent simulations; the ups and downs of its density plots are more drastic than in the other models.
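The mismatch above is structural: a Normal likelihood can match the overall mean and spread of trimodal data but never the humps themselves. A toy sketch with made-up numbers (the 3600/4500/5500-gram batches come from the discussion above; the batch sizes and spreads are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trimodal "body mass" data: three batches like those described above,
# each with an assumed within-batch sd of 200 g.
batches = [rng.normal(m, 200, 1000) for m in (3600, 4500, 5500)]
y = np.concatenate(batches)

# A single-Normal posterior predictive can only match the overall mean and sd,
# so its spread is inflated well beyond any one batch's spread, and its single
# peak near the grand mean sits in a low-density valley between the humps.
mu, sd = y.mean(), y.std()
print(sd > 2 * 200)  # True: the overall sd dwarfs the within-batch sd
```

This is why models that account for species (a proxy for the batches) reproduce the multimodality better than a one-Normal summary ever could.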
set.seed(6)
prediction_summary_cv(model = model_1, data = penguins_complete, k = 10)
## $folds
## fold mae mae_scaled within_50 within_95
## 1 1 244.7567 0.6206982 0.5428571 0.8571429
## 2 2 221.2763 0.5559927 0.6470588 0.9705882
## 3 3 223.1449 0.5524455 0.5588235 1.0000000
## 4 4 305.4568 0.7784934 0.3823529 0.9117647
## 5 5 236.6626 0.5876993 0.6470588 0.9411765
## 6 6 235.3445 0.5846819 0.5882353 0.9705882
## 7 7 347.2988 0.8854379 0.3823529 0.9411765
## 8 8 374.3789 0.9642576 0.2941176 0.8823529
## 9 9 242.0851 0.6076695 0.5294118 0.9411765
## 10 10 205.3015 0.5127264 0.6285714 0.9714286
##
## $cv
## mae mae_scaled within_50 within_95
## 1 263.5706 0.6650102 0.520084 0.9387395
prediction_summary_cv(model = model_2, data = penguins_complete, k = 10)
## $folds
## fold mae mae_scaled within_50 within_95
## 1 1 378.0636 0.7947486 0.4285714 1.0000000
## 2 2 219.2386 0.4756386 0.6176471 0.8823529
## 3 3 389.8686 0.8476909 0.3823529 0.9411765
## 4 4 273.7677 0.5856807 0.5588235 0.9705882
## 5 5 388.6667 0.8526164 0.2647059 0.9117647
## 6 6 429.6154 0.9321221 0.2647059 0.9411765
## 7 7 319.4035 0.6777439 0.5000000 1.0000000
## 8 8 220.4525 0.4638089 0.6764706 0.9705882
## 9 9 317.4803 0.6748082 0.5000000 0.9705882
## 10 10 342.4763 0.7329584 0.4285714 1.0000000
##
## $cv
## mae mae_scaled within_50 within_95
## 1 327.9033 0.7037816 0.4621849 0.9588235
prediction_summary_cv(model = model_3, data = penguins_complete, k = 10)
## $folds
## fold mae mae_scaled within_50 within_95
## 1 1 311.9386 0.8284532 0.4000000 0.9714286
## 2 2 149.2133 0.3877813 0.5588235 0.9705882
## 3 3 192.5488 0.4982378 0.6764706 0.9705882
## 4 4 202.6581 0.5350086 0.5294118 0.9705882
## 5 5 271.9371 0.7305004 0.4705882 0.8823529
## 6 6 331.7906 0.8997268 0.3235294 0.9117647
## 7 7 320.5120 0.8568999 0.4117647 0.9705882
## 8 8 253.7694 0.6642534 0.5000000 0.9705882
## 9 9 222.5072 0.5870563 0.6176471 0.9117647
## 10 10 279.9564 0.7313556 0.4571429 0.9714286
##
## $cv
## mae mae_scaled within_50 within_95
## 1 253.6832 0.6719273 0.4945378 0.9501681
prediction_summary_cv(model = model_4, data = penguins_complete, k = 10)
## $folds
## fold mae mae_scaled within_50 within_95
## 1 1 195.9563 0.4868010 0.6571429 0.9714286
## 2 2 222.7693 0.5569325 0.5588235 0.9411765
## 3 3 265.3529 0.6763592 0.5000000 0.9117647
## 4 4 230.7305 0.5728605 0.6176471 0.9705882
## 5 5 298.9037 0.7510093 0.3823529 0.9705882
## 6 6 275.8985 0.6921216 0.5000000 0.9411765
## 7 7 316.6102 0.8161499 0.4705882 0.9117647
## 8 8 287.5275 0.7252071 0.4705882 0.9411765
## 9 9 264.9713 0.6684383 0.5000000 0.9705882
## 10 10 339.9279 0.8628930 0.3714286 0.9428571
##
## $cv
## mae mae_scaled within_50 within_95
## 1 269.8648 0.6808772 0.5028571 0.9473109
By comparison, model_1 has the smallest scaled mae (0.665) and the largest within_50 coverage statistic (0.520), yet the smallest within_95 coverage statistic (0.939). Model_2 has the largest mae (327.903) and scaled mae (0.704) as well as the smallest within_50 coverage statistic (0.462), yet the largest within_95 coverage statistic (0.959). Model_3 has the smallest mae (253.683). Model_4 has middling performance on all four metrics.
Putting it all together, model_1 and model_3 both appear to provide better posterior predictions of penguin body mass than the rest of our models: they have relatively smaller mae and scaled mae as well as relatively higher within_50 and within_95 coverage statistics. Though model_1 has the smallest within_95 coverage statistic, it is not far from those of the other models. By contrast, though model_2 has the largest within_95 coverage statistic, it performs worst on mae, scaled mae, and within_50 coverage, so it is outperformed by the other three models.
Putting this into context: by utilizing flipper_length_mm and species together, or flipper_length_mm alone, model_3 and model_1 convincingly outperform the models that use species alone or flipper_length_mm, bill_length_mm, and bill_depth_mm together (model_2 and model_4). This indicates that flipper_length_mm is a core predictor of penguin body mass and that species can serve as an auxiliary predictor (more evaluation is needed), whereas the two additional predictors (bill_length_mm and bill_depth_mm) don't substantially improve our understanding of penguin body mass once we already have information about flipper_length_mm and species.
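The per-fold metrics reported by prediction_summary_cv() can be reproduced by hand from posterior predictive draws: mae is a median absolute distance between each observation and its posterior predictive median, mae_scaled puts those distances on the scale of the predictive spread, and within_50 / within_95 are the shares of observations falling inside their 50% and 95% prediction intervals. A plausible hand reconstruction with hypothetical draws (the exact scaling bayesrules uses may differ slightly; one column of `draws` per observation, and the cross-validation loop is omitted):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior predictive draws: 20000 draws x 5 observations.
truth = np.array([3600.0, 4200.0, 4800.0, 5100.0, 5600.0])
draws = truth + rng.normal(0, 300, size=(20000, truth.size))

pred_median = np.median(draws, axis=0)
pred_mad = np.median(np.abs(draws - pred_median), axis=0)

errors = np.abs(truth - pred_median)
mae = np.median(errors)                    # median absolute error
mae_scaled = np.median(errors / pred_mad)  # ...relative to predictive spread

# Coverage: share of observations inside their 50% / 95% prediction intervals.
lo50, hi50 = np.quantile(draws, [0.25, 0.75], axis=0)
lo95, hi95 = np.quantile(draws, [0.025, 0.975], axis=0)
within_50 = np.mean((truth >= lo50) & (truth <= hi50))
within_95 = np.mean((truth >= lo95) & (truth <= hi95))
```

Since these toy draws are centered exactly on the truth, both coverage statistics come out at 1; with real penguins and k-fold splits, the held-out folds produce the roughly 0.5 and 0.95 coverages seen above.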
# Calculate ELPD for the 4 models
set.seed(6)
loo_1 <- loo(model_1)
loo_2 <- loo(model_2)
loo_3 <- loo(model_3)
loo_4 <- loo(model_4)
# Results
c(loo_1$estimates[1], loo_2$estimates[1], loo_3$estimates[1], loo_4$estimates[1])
## [1] -2531.351 -2586.175 -2515.609 -2531.580
# Compare the ELPD for the 4 models
loo_compare(loo_1, loo_2, loo_3, loo_4)
## elpd_diff se_diff
## model_3 0.0 0.0
## model_1 -15.7 5.1
## model_4 -16.0 6.4
## model_2 -70.6 9.8
The loo() function calculates the expected log-predictive density (ELPD) for each model. The larger the expected logged posterior predictive pdf across a new set of data points, the more accurate the posterior predictions of the new data; by this criterion, model_3 appears to produce the most accurate posterior predictions among the four models. loo_compare() lists the models in order from best to worst, i.e. from the highest ELPD to the lowest: model_3, model_1, model_4, and model_2. The remaining output details how each model compares to the best model, model_3. For example, the ELPD for model_1 (-2531.351) is estimated to be 15.7 lower (worse) than that of model_3 (-2515.609), and this estimated difference has a corresponding standard error of 5.1. That is, we believe the true difference in the ELPDs of model_3 and model_1 is within roughly 2 standard errors of the estimated -15.7 difference: -15.7 ± 2*5.1 = (-25.9, -5.5).
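That back-of-the-envelope range is just the estimate plus or minus two standard errors, which is quick to verify:

```python
# Approximate 95% range for the model_1 minus model_3 ELPD difference,
# using the elpd_diff and se_diff values from loo_compare() above.
elpd_diff = -15.7
se_diff = 5.1
interval = (round(elpd_diff - 2 * se_diff, 1),
            round(elpd_diff + 2 * se_diff, 1))
print(interval)  # (-25.9, -5.5)
```

Since 0 lies outside this range, the evidence that model_3's posterior predictive accuracy exceeds model_1's is reasonably strong.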
mass_model_1_df <- as.data.frame(mass_model_1)
View(mass_model_1_df)
mass_model_1_df %>%
  mutate(Adelie = `(Intercept)` + flipper_length_mm * 189.9536) %>%
  mcmc_areas(pars = "Adelie")
data("penguins_bayes")
penguins_bayes %>%
  group_by(species) %>%
  summarise(mean_Adelie = mean(flipper_length_mm, na.rm = TRUE))
## # A tibble: 3 × 2
## species mean_Adelie
## <fct> <dbl>
## 1 Adelie 190.
## 2 Chinstrap 196.
## 3 Gentoo 217.
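The mutate() step above builds a posterior for the typical Adelie body mass by plugging the mean Adelie flipper length (189.9536 mm, from the summarise() output) into every posterior draw of the intercept and slope. The same linear-combination trick, sketched with hypothetical draws (NOT the real mass_model_1 posterior; the centers and spreads below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical posterior draws for (Intercept) and the flipper_length_mm slope.
intercept = rng.normal(-5800.0, 300.0, 20000)
slope = rng.normal(50.0, 2.0, 20000)

# Posterior for the typical Adelie body mass at the mean flipper length:
# one combined value per joint posterior draw.
adelie = intercept + slope * 189.9536

lower, upper = np.quantile(adelie, [0.025, 0.975])
print(adelie.mean())  # near -5800 + 50 * 189.9536, i.e. roughly 3698 g
```

Combining draw-by-draw like this (rather than combining point estimates) is what lets mcmc_areas() display a full posterior, with its uncertainty, for the derived Adelie quantity.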
I haven’t made a final decision yet (I caught a bad cold), but I mostly want to talk about video games in the presentation. I am interested in using the video_games.csv dataset recently released on GitHub. The metascore, average_playtime, price, and owners variables grab my attention. Given the nature of those variables, the Normal-Normal model would be useful for examining the relationships between them.